Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models
نویسندگان
چکیده
We attempted to estimate subjective scores of the Japanese Diagnostic Rhyme Test (DRT), a two-to-one forced selection speech intelligibility test, using automatic speech recognizers with language models that force one of the words in the wordpair. The acoustic models were adapted to the speaker, and then adapted to noise at a specified SNR. The match between subjective and recognition scores improved significantly when the adapted noise level and the tested level match. However, when SNR conditions do not match, the recognition scores degraded especially when test SNR conditions were higher than the adapted level. Index Terms Objective estimation, speech intelligibility, speech recognition, Japanese DRT, noise adaptation
منابع مشابه
Objective intelligibility assessment of pathological speakers
Intelligibility is a primary measure for the assessment of pathological speech. Traditionally, it is measured using a perceptual test, which is by definition subjective in nature. Consequently, there is a great interest in reliable, automatic and therefore objective methods. This paper presents such a method that incorporates an automatic speech recognizer (ASR) for producing features that char...
متن کاملمدل میکروسکوپی دوگوشی مبتنی بر فیلتر بانک مدولاسیون برای پیش گویی قابلیت فهم گفتار در افراد دارای شنوایی عادی
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
متن کاملSpeech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
متن کاملThe effect of redesign workstation on Speech Interference Level (SIL) among bank tellers
Abstract Background: There is always an interaction between man and his environment that can be the cause of physical, physiological and psychological stress on people and also cause discomfort, annoyance, and have direct and indirect effects on their performance and productivity, health and safety. People in their workplace are exposed to many factors related to work activities and environmen...
متن کاملCipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...
متن کامل